Finding Photograph Captions Multimodally on the World Wide Web
Authors
Abstract
Several software tools index the text of the World Wide Web, but little attention has been paid to the many valuable photographs. We present a relatively simple way to index them by localizing their likely explicit and implicit captions with a kind of expert system. We use multimodal clues from the general appearance of the image, the layout of the Web page, and the words near the image that are likely to describe it. Our MARIE-3 system avoids full image processing and full natural-language processing, but demonstrates a surprising degree of success, and can thus serve as a preliminary filter for such detailed content analysis. Experiments with a randomly chosen set of Web pages concerning the military showed 41% recall with 41% precision for individual caption identification, or 70% recall with 30% precision, although captions averaged only 1.4% of the page text.
Similar resources
Precise and Efficient Retrieval of Captioned Images: The MARIE Project
The MARIE project has explored knowledge-based information retrieval of captioned images of the kind found in picture libraries and on the Internet. It exploits the idea that images are easier to understand with context, especially descriptive text near them, but it also does image analysis. The MARIE approach has five parts: (1) find the images and captions; (2) parse and interpret the captio...
Marie-4: A High-Recall, Self-Improving Web Crawler That Finds Images Using Captions
page text describes associated images, and images are not captioned consistently. Content-based image retrieval systems that analyze the images themselves are progressing, but the systems require considerable image-preprocessing time. Furthermore, surveys of users doing image retrieval show that users are more interested in the identification of objects and actions depicted by images than in t...
Improving Accessibility of Transaction-centric Web Objects
Advances in web technology have considerably widened the Web accessibility divide between sighted and blind users. This divide is especially acute when conducting online transactions, e.g., shopping, paying bills, making travel plans, etc. Such transactions span multiple web pages and require that users find clickable objects (e.g., “add-to-cart” button) which are essential for transaction prog...
Using Controlled Vocabularies in Automated Subject Classification of Textual Web Pages, in the Context of Browsing
Automated subject classification has been a challenging research issue for several decades now. The purpose of this thesis is to determine to what degree controlled vocabularies that have been traditionally used in libraries could be utilised in automated classification of textual Web pages, in the context of browsing. Usefulness of different characteristics of controlled vocabularies for autom...
Generating Image Captions using Topic Focused Multi-document Summarization
In the near future digital cameras will come standardly equipped with GPS and compass and will automatically add global position and direction information to the metadata of every picture taken. Can we use this information, together with information from geographical information systems and the Web more generally, to caption images automatically? This challenge is being pursued in the TRIPOD pr...
Journal title:
Volume, issue:
Pages: -
Publication date: 2002